Go: migrate control flow graph to shared CFG library#21614
Go: migrate control flow graph to shared CFG library#21614
Conversation
…show you the current state of the file: Read ControlFlowGraphShared.qll, lines 1 to 50 The file compiles successfully. Here's a summary of what was implemented: ControlFlowGraphShared.qll — Shared CFG library instantiation for Go, wrapped in the GoCfg module to coexist with the existing CFG during transition. Key design decisions: Ast module (implements AstSig): Callable = FuncDef with a body IfStmt — only For if-statements without init (those with init use custom steps) LoopStmt — uses characteristic predicate pattern (not direct extension) to avoid return type conflicts with BlockStmt Empty classes for: WhileStmt, DoStmt, ForStmt, ForeachStmt, ThrowStmt, TryStmt, CatchClause, Switch, Case, ConditionalExpr, NullCoalescingExpr (Go doesn't have these constructs) LogicalAndExpr = LandExpr, LogicalOrExpr = LorExpr, LogicalNotExpr = NotExpr BooleanLiteral maps to true/false identifiers skipCfg filters out type expressions, non-rvalue references, parenthesized expressions, etc. Input module (implements InputSig1 + InputSig2): Labels: Go statement labels + fallthrough Custom steps for: if-with-init, for loops, range loops, expression/type switch, case clauses, select statements, defer, go statements Abrupt completions: panic via calls/div/deref/type-assert/index/conversion, fallthrough, goto End abrupt completions: labeled statements catch breaks, goto targets Remaining work (todos 4-6): The GoCfg module is self-contained and doesn't yet replace the existing CFG. Integration with BasicBlocks.qll, IR.qll, and the public ControlFlowGraph.qll API is the next phase — which is a larger effort requiring updates to DataFlow and all downstream consumers.
…red into Go: Created ControlFlowGraphShared.qll — contains the full GoCfg module with: Ast module implementing AstSig<Location> — maps Go's AST types to the shared interface, with empty types for constructs Go doesn't have (while/do/try/catch/throw/ternary/null-coalescing) Input module implementing InputSig1 + InputSig2 — labels, conditional contexts, custom step() predicates for if-with-init, for/range loops, expression/type switches, case clauses, select, defer, and go statements; abrupt completions for panics, fallthrough, and goto Make0/Make1/Make2 wiring and import Public Modified ControlFlowGraph.qll — added import of the shared library and re-exports it as module SharedCfg = GoCfg, making shared types accessible as SharedCfg::ControlFlowNode, SharedCfg::ControlFlow::EntryNode, SharedCfg::BasicBlock, etc. No breakage — all 90 src queries and library files compile cleanly.
Phase 2 progress — shared CFG library fully instantiated with IR support: ControlFlowGraphShared.qll (1290 lines) now contains: additionalNode entries for all 30+ synthetic IR operations: Assignment writes (assign:i), compound RHS (compound-rhs), tuple extraction (extract:i) Zero initialization (zero-init:i), increment/decrement (implicit-one, incdec-rhs) Return (return), result write/read (result-write:i, result-read:i) Function prologue (param-init:i, arg:i, result-init:i, result-zero-init:i) Range next (next), send (send), implicit deref (implicit-deref) Slice bounds (implicit-low/high/max), case checks (case-check:i) Type switch variables (type-switch-var), literal init (lit-init, lit-index) Implicit field selection (implicit-field:i) step routing for all constructs needing custom flow: Assignments (with epilogue node sequencing for extract/assign chains) Increment/decrement (operand → implicit-one → incdec-rhs → In) Return statements (expressions → extract → result-write → return) Index/slice/selector expressions (with implicit deref insertion) Composite literals (pre-order: allocation first, then element init) Send statements (channel → value → send) Plus the existing: if-with-init, for/range loops, switch/select, defer, go What remains for full migration (making shared types primary): Rewrite ControlFlowGraph.qll: change ControlFlow::Node from TControlFlowNode to GoCfg::ControlFlowNode Rewrite IR.qll (1598 lines): map all 30+ IR instruction classes to shared CFG nodes via getAstNode()/getIdTag() instead of Mk* types Update BasicBlocks.qll to use shared library's built-in BasicBlock Update ~30 downstream files (SSA, DataFlow, security queries) that reference IR::Instruction/ControlFlow::Node Add function prologue/epilogue step routing (param init, result init, defer LIFO at exit) Remove ControlFlowGraphImpl.qll This remaining work is a substantial rewrite (3000+ lines across 35+ files) that would benefit from being done incrementally with test validation at each step. Shall I continue with the IR.qll rewrite next?
…t fields, cleanup Agent-Logs-Url: https://github.com/github/codeql/sessions/011f8465-5ad3-44e2-9d84-f5314ea1afb2 Co-authored-by: owen-mc <62447351+owen-mc@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/codeql/sessions/011f8465-5ad3-44e2-9d84-f5314ea1afb2 Co-authored-by: owen-mc <62447351+owen-mc@users.noreply.github.com>
There was a problem hiding this comment.
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
|
@copilot Please fix the following errors from the compiler: |
…In guards, mark implicitFieldSelection as additional Agent-Logs-Url: https://github.com/github/codeql/sessions/8d96128c-44db-4ba1-9eca-1a5b37f54442 Co-authored-by: owen-mc <62447351+owen-mc@users.noreply.github.com>
Fixed in 1db6bac:
|
|
@copilot Unlike java, go has AST nodes which aren't in a function definition, so we have to allow |
Agent-Logs-Url: https://github.com/github/codeql/sessions/2867f173-c3f1-4564-9885-3ff4517bbf30 Co-authored-by: owen-mc <62447351+owen-mc@users.noreply.github.com>
Fixed in c0a59ad. |
|
@copilot Please fix these warnings, as long as they are in go code and not shared libraries. |
… branches Agent-Logs-Url: https://github.com/github/codeql/sessions/12c0f9c8-1b85-4db4-b26f-8dbce818aa5a Co-authored-by: owen-mc <62447351+owen-mc@users.noreply.github.com>
Fixed in 66a191d. Added QLdoc comments to all 96 items in IR.qll — classes, member predicates, newtype branches, and classless predicates — matching the documentation from the old code where applicable. |
Adopts the shared
codeql/controlflowlibrary for Go's CFG implementation, replacing the bespokeControlFlowGraphImpl.qll(2133 lines) with a shared-library instantiation.Core architecture changes
ControlFlowGraphShared.qll(new): InstantiatesCfgLib::Make0/Make1/Make2with Go-specificAstandInputmodules. Wraps everything inGoCfgto isolate from the old API during transition.ControlFlowGraph.qll:ControlFlow::Nodenow extendsGoCfg::ControlFlowNodeinstead of the oldTControlFlowNodenewtype.BasicBlocks.qll:BasicBlockaliased toGoCfg::Cfg::BasicBlock.IR.qll: All 30+ instruction classes rewritten to identify themselves viaisAdditional/isIn/isBefore/isAfterpredicates instead ofMk*newtype constructors.ControlFlowGraphImpl.qll: fully superseded.File-level CFG roots
Unlike Java, Go has AST nodes (e.g. top-level
var,const,typedeclarations) that are not inside any function definition. To preserve CFG coverage for these nodes,Callableincludes bothFuncDefs with bodies andFiles with declarations.getEnclosingCallablefalls back tonode.getFile()for nodes not inside any function, andControlFlow::RootincludesFiles with declarations — matching the behavior of the oldControlFlowGraphImpl.qll.Function prologue/epilogue routing
Changed
callableGetBody(c) = c(instead ofc.getBody()) so the shared library routesEntry(fd) → Before(fd), allowingfuncDefStepto insert prologue and epilogue nodes:Before(fd)→arg:i→param-init:i(per parameter) →result-zero-init:j→result-init:j(per result var) →Before(body)After(body)→result-read:j(per result var) →After(fd)→NormalExitThis makes
InitParameterInstructionand related classes reachable in the CFG, which is required for SSA and data flow (e.g.,SsaParameterNodeviaIR::initParamInstruction).Implicit promoted-field reads
Fixed
selectorExprStepto route throughimplicit-field:iadditional nodes for promoted field accesses, makingImplicitFieldReadInstructionreachable. The chain descends from the outermost embedded field (highest index) to the innermost (index 1) before reachingIn(sel):Minor fixes
SsaImpl.qll: castbb.getNode(i)toControlFlow::Nodebefore callingisEntryNode().Revel.qll: usegetBaseInstruction()instead of the removedgetBase()onImplicitFieldReadInstruction.Original prompt
Continue
[Chronological Review: The conversation began with the user requesting a list of commits merged as part of a specific commit hash (e695477). Following that, the user asked to adopt a new shared CFG library for Go based on those commits. The user then prompted to "Continue to iterate?" multiple times, indicating a desire to progress with the implementation. The user also requested to "Do phase 2 now," suggesting a structured approach to the work. Throughout the conversation, there were indications of changes made to specific files, which required checking the current contents before proceeding with edits.][Intent Mapping:
[Technical Inventory:
[Code Archaeology:
[Progress Assessment:
[Context Validation: All critical information for continuation is captured, including the need to check file contents before making edits and the structured approach to the project.]
[Recent Commands Analysis:
1. Conversation Overview: - Primary Objectives: - "List commits merged as part of e695477" - Understanding changes related to a specific commit. - "Using those commits as a guideline, adopt the new shared CFG library for go." - Implementing a new library based on previous changes. - "Continue to iterate?" - Seeking confirmation to proceed with ongoing work. - "Do phase 2 now" - Moving to the next phase of the project. - Session Context: The conversation has focused on implementing changes to the Go control flow library, with multiple requests to continue iterating on the work. - User Intent Evolution: The user has consistently sought to progress through the implementation phases, indicating a structured approach to the project.- Technical Foundation:
- GitHub CodeQL: A tool for code analysis being used to facilitate the implementation of changes.
- Files:
- IR.qll: The current working file for implementing changes.
- ControlFlowGraph.qll and ControlFlowGraphShared.qll: Files mentioned as having recent edits that need to be checked.
- Codebase Status:
- IR.qll:
- Purpose: Central file for implementing the new CFG library.
- Current State: User is actively working on this file.
- Key Code Segments: Specific functions or classes were not detailed in the conversation.
- Dependencies: Relates to the overall control flow implementation in Go.
- ControlFlowGraph.qll and ControlFlowGraphShared.qll:
- Purpose: Supporting files for the control flow library.
- Current State: Recent edits made, requiring review before further changes.
- Problem Resolution:
- Issues Encountered: Need to check file contents before making edits due to recent changes.
- Solutions Implemented: User has been iterating through the implementation process.
- Debugging Context: No specific debugging issues were mentioned, but the user is cautious about recent changes.
- Lessons Learned: Importance of reviewing file changes before proceeding with edits.
- Progress Tracking:
- Completed Tasks: User has identified the need for ...
Created from VS Code.